Abstract. Let $X_N$ be an $N \times N$ matrix whose entries are i.i.d. complex random variables with mean zero and variance $1/N$. We study the asymptotic spectral distribution of the eigenvalues of the covariance matrix $X_N^* X_N$ for $N \to \infty$. We prove that the empirical density of eigenvalues in an interval $[E, E + \eta]$ converges to the Marchenko-Pastur law locally on the optimal scale, $N\eta/\sqrt{E} \gg (\log N)^b$, and in any interval up to the hard edge, $(\log N)^b/N^2 \lesssim E \le 4 - \kappa$, for any $\kappa > 0$. As a consequence, we show the complete delocalization of the eigenvectors.

1. Introduction 

Let $X$ be an $N \times M$ matrix with entries $x_{ij} = \mathrm{Re}\, x_{ij} + i\, \mathrm{Im}\, x_{ij}$. We assume that $\mathrm{Re}\, x_{ij}$ and $\mathrm{Im}\, x_{ij}$ are independent identically distributed real random variables with mean zero and variance $1/2$, so that

\[ \mathbb{E}\, x_{ij} = 0 \quad \text{and} \quad \mathbb{E}\, |x_{ij}|^2 = 1, \qquad i = 1, \dots, N, \quad j = 1, \dots, M. \]

In what follows we shall denote by $X_N$ the scaled matrix

\[ X_N = X/\sqrt{N}. \tag{1.1} \]

We denote by $\nu$ the probability distribution of $\mathrm{Re}\, x_{ij}$ and $\mathrm{Im}\, x_{ij}$. Let $s_\alpha$, $\alpha = 1, \dots, N$, be the eigenvalues of $X_N^* X_N$. Since $X_N^* X_N$ is positive semidefinite we can assume that $0 \le s_1 \le s_2 \le \dots \le s_N$. The results of this paper extend easily to $X_N$ having real entries; to simplify the notation, we will consider in the following only the case of complex entries.
Assume $d = \lim N/M > 0$ and let

Marchenko and Pastur showed in [19] the convergence of the density of the eigenvalues $s_1, \dots, s_N$ towards the Marchenko-Pastur law



(1-2) Pmp(E) - 



2tt\ E 2 

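whenever $E \in [\lambda_-, \lambda_+]$ and $\rho_{MP}(E) = 0$ otherwise. In this paper, we will be interested in the case $d = 1$. In this case the Marchenko-Pastur law is supported on the interval $[0, 4]$ and is given by

\[ \rho_{MP}(E) = \frac{1}{2\pi} \sqrt{\frac{4 - E}{E}}, \qquad 0 < E \le 4. \]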



It therefore has an $E^{-1/2}$ singularity close to the origin $E = 0$. This reflects the fact that the typical distance between eigenvalues is of order $\sqrt{E}/N$ rather than $1/N$, as it is in the bulk; for this reason, $E = 0$ is known as the hard edge of the sample covariance matrix $X_N^* X_N$ (soft edges are instead characterized by the fact that the typical distance between neighbouring eigenvalues is larger than in the bulk). While the result of [19] determines the convergence to (1.2) on intervals of order one, containing typically order $N$ eigenvalues, in the present paper we establish the convergence of the density of states locally, on intervals containing typically a bounded number of eigenvalues, independent of $N$. In particular, we consider intervals close to the hard edge $E = 0$. As a direct consequence of the local validity of the Marchenko-Pastur law, we obtain the complete delocalization of the eigenvectors associated to eigenvalues up to the edge. A further possible application of our results consists in establishing the universality of the local eigenvalue correlations close to the hard edge; this can be obtained following the recipe of [7], making use of the result of [2], in the case of complex entries, or similarly to [12, 13], using the method of the local relaxation flow, for $X_N$ having either real or complex entries. We observe, however, that the universality of the local eigenvalue correlations close to the hard edge (where they can be described in terms of the so-called Bessel kernel) has already been established, using a different approach, in [22].
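The scaling of the eigenvalue spacing near the hard edge can be illustrated numerically. The following is a minimal sketch, assuming the standard $d = 1$ Marchenko-Pastur density $\rho_{MP}(E) = \frac{1}{2\pi}\sqrt{(4-E)/E}$ and the heuristic spacing $1/(N \rho_{MP}(E))$; the function names are ours and purely illustrative.

```python
import math

def rho_mp(E: float) -> float:
    """Marchenko-Pastur density for d = 1, supported on (0, 4]."""
    if E <= 0.0 or E > 4.0:
        return 0.0
    return math.sqrt((4.0 - E) / E) / (2.0 * math.pi)

def total_mass(steps: int = 20000) -> float:
    """Midpoint-rule integral of rho_mp over (0, 4] after substituting E = t**2,
    which removes the E**(-1/2) singularity at the hard edge."""
    h = 2.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += rho_mp(t * t) * 2.0 * t * h
    return total

def typical_spacing(E: float, N: int) -> float:
    """Heuristic local eigenvalue spacing 1 / (N * rho_mp(E))."""
    return 1.0 / (N * rho_mp(E))

mass = total_mass()
# Near the hard edge the spacing is comparable to sqrt(E)/N; in the bulk to 1/N.
edge_ratio = typical_spacing(1e-4, 1000) / (math.sqrt(1e-4) / 1000)
bulk_ratio = typical_spacing(2.0, 1000) / (1.0 / 1000)
```

Both ratios stay of order one, which is the content of the statement that the spacing is of order $\sqrt{E}/N$ close to $E = 0$ and of order $1/N$ in the bulk.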

In recent years, a lot of progress has been achieved in the spectral analysis of random matrices. Local convergence of the density of states of Wigner matrices to the semicircle law and delocalization of the eigenvectors have been established in [9, 10, 11, 15]. Universality of the local eigenvalue correlations was proven for Wigner ensembles with arbitrary symmetry (real symmetric, hermitian, or quaternion hermitian ensembles) in [12, 15]. This result was obtained by the introduction of the local relaxation flow, a flow for the eigenvalues of the Wigner matrix with the property of fast relaxation to equilibrium (and such that, locally, it remains close to the Dyson Brownian motion described by the eigenvalues when the entries are evolved by independent Brownian motions). For ensembles of hermitian Wigner matrices, universality was proven earlier in [T[, [23], [8]. In all these proofs of universality, the local convergence of the density of states was a crucial ingredient. Universality at the edge of Wigner matrices was proven in [21] and more recently in [24], [U \TE[ \T7\ . For sample covariance matrices with $0 < d < 1$, local convergence to the Marchenko-Pastur law and universality of the local eigenvalue correlations were determined in the bulk [16, 25] and at the soft edge [26 , IJ [2D] . More recently, local convergence of the density of states and delocalization results have also been obtained for more structured ensembles, such as the adjacency matrices of Erdős-Rényi graphs [HE] and band matrices [BJ. In this paper, we focus on the hard edge of sample covariance matrices, proving the local convergence of the density of states to the Marchenko-Pastur law on the optimal scale (up to logarithmic corrections). As a consequence, we obtain complete delocalization of the eigenvectors associated with eigenvalues close to the hard edge.

After the completion of our work, we learned that, independently from us, Bourgade, Yau and Yin 
study in [3] the convergence of the density of the eigenvalues of a random matrix X with no symmetry 
constraints towards the circular law, on optimal scales. The basic ingredient of their proof is the study 
of the spectrum of the hermitization $(X - z)^*(X - z)$. In particular, for $z = 0$, they obtain results
similar to ours for the eigenvalues of sample covariance matrices. 

An important object used in the proof of the local validity of the Marchenko-Pastur law is the Stieltjes transform, defined for any $\theta \in \mathbb{C}$ with $\mathrm{Im}\, \theta > 0$ by

\[ \Delta_N(\theta) = \frac{1}{N} \mathrm{Tr}\, (X_N^* X_N - \theta)^{-1} = \frac{1}{N} \sum_{\alpha=1}^{N} \frac{1}{s_\alpha - \theta}. \]

In a similar way, one defines $\Delta(\theta)$ to be the Stieltjes transform of the Marchenko-Pastur distribution. In the case $d = 1$ that will be considered in this paper

\[ \Delta(\theta) = \int \frac{\rho_{MP}(x)}{x - \theta}\, dx = -\frac{1}{2} + \frac{1}{2} \sqrt{1 - \frac{4}{\theta}}. \]

Local convergence towards the Marchenko-Pastur law follows from the convergence of $\Delta_N$ towards $\Delta$. To simplify our analysis, we will assume that $\nu$ has subgaussian decay, i.e., that there exists $\delta_0 > 0$ such that

\[ \int_{\mathbb{R}} e^{\delta_0 x^2}\, d\nu(x) < \infty. \tag{1.5} \]

This condition is needed to apply a version of a theorem of Hanson and Wright as formulated in [11, Prop. 4.5]; see also Proposition 2.2 below. At the price of getting weaker convergence rates, this assumption can be substantially relaxed (existence of sufficiently high moments is sufficient). Furthermore, we assume that the probability density function of the real and imaginary parts of the entries is bounded; this will simplify the study of the eigenvalues located very close to the origin. Also in this case, improvements are certainly possible.



Our first result is a proof of a bound on the number of eigenvalues $s_\alpha$ in a window $I = [E, E + \eta]$, valid up to the hard edge and for small $\eta$ (s.t. $N\eta/\sqrt{E} \gg (\log N)^b$, $b > 2$). The proof of the following theorem can be found in Section 3.

Theorem 1. Let $X_N$ be an $N \times N$ matrix as described in (1.1), whose entries satisfy (1.5). Let $I = [E, E + \eta]$ with $N\eta/\sqrt{E} \ge (\log N)^b$, for some $b > 2$. Denote by $\mathcal{N}_I$ the number of eigenvalues of $X_N^* X_N$ in $I$. Then there exist constants $c, C, K_0 > 0$ such that, for any $K > K_0$ and $N$ large enough,

\[ \mathbb{P}\left( \mathcal{N}_I \ge K \frac{N\eta}{\sqrt{E}} \right) \le e^{-c \sqrt{K N \eta/\sqrt{E}}}. \tag{1.6} \]



Using the a priori bound in Theorem 1 we prove the convergence of the Stieltjes transform $\Delta_N(E + i\eta)$ of the sample covariance matrices towards the Stieltjes transform $\Delta(E + i\eta)$ of the Marchenko-Pastur law, up to the hard edge, and close to the real axis.

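Theorem 2. Let $X_N$ be an $N \times N$ matrix as described in (1.1), whose entries satisfy (1.5). Assume moreover that the probability density function of the real and imaginary part of the entries is bounded. Moreover set $\theta = E + i\eta$, with $E \le 4 - \kappa$, $0 < \eta \le \kappa E$, $N\eta/\sqrt{E} \ge (\log N)^b$, for some $b > 4$ and $0 < \kappa < 1$ (these bounds also imply that $E \ge (\log N)^{2b}/(\kappa^2 N^2)$). Then there exist $\varepsilon_0 > 0$, $c > 0$ such that for $0 < \varepsilon < \varepsilon_0$ and $N$ large enough,

\[ \mathbb{P}\left( |\Delta_N(\theta) - \Delta(\theta)| \ge \frac{\varepsilon}{\sqrt{E}} \right) \le e^{-c \varepsilon \sqrt{N\eta/\sqrt{E}}} + e^{-c (\log N)^{b/4}}. \]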



The proof of this theorem is in Section 5. The convergence of the Stieltjes transform immediately implies the convergence of the density of states.
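The parenthetical remark in Theorem 2 (the two constraints on $\eta$ force a lower bound on $E$) is simple arithmetic, and can be sketched as follows. This is only an illustration of the hypotheses, with helper names of our choosing; it says nothing about how large $N$ must be for the theorem itself to apply.

```python
import math

def eta_window(N: int, E: float, kappa: float, b: float):
    """Return (eta_min, eta_max) allowed by N*eta/sqrt(E) >= (log N)**b and
    eta <= kappa*E, or None when the two constraints are incompatible."""
    eta_min = math.sqrt(E) * math.log(N) ** b / N
    eta_max = kappa * E
    return (eta_min, eta_max) if eta_min <= eta_max else None

def smallest_admissible_E(N: int, kappa: float, b: float) -> float:
    """Lower bound E >= (log N)**(2b) / (kappa**2 * N**2) implied by the constraints."""
    return math.log(N) ** (2 * b) / (kappa ** 2 * N ** 2)

# Hypothetical parameter values, chosen only to make the window non-trivial.
N, kappa, b = 10**12, 0.5, 4.5
E_min = smallest_admissible_E(N, kappa, b)
window_above = eta_window(N, 2.0 * E_min, kappa, b)   # non-empty window
window_below = eta_window(N, 0.5 * E_min, kappa, b)   # empty window
```

Combining $\sqrt{E}\,(\log N)^b/N \le \eta \le \kappa E$ gives $\sqrt{E} \ge (\log N)^b/(\kappa N)$, which is exactly the bound returned by `smallest_admissible_E`.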

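Theorem 3. Let $X_N$ be an $N \times N$ matrix as described in (1.1), whose entries satisfy (1.5). Assume moreover that the probability density function of the real and imaginary part of the entries is bounded. Suppose $E \le 4 - \kappa$, $0 < \eta \le \kappa E$, $N\eta/\sqrt{E} \ge (\log N)^b$, for some $b > 4$ and $0 < \kappa < 1$, and let $I = [E, E + \eta]$. Then there exist $\varepsilon_0 > 0$, $C, c > 0$ such that for $0 < \varepsilon < \varepsilon_0$ and $N$ large enough,

\[ \mathbb{P}\left( \left| \frac{\mathcal{N}_I}{N\eta} - \frac{1}{\eta} \int_I ds\, \rho_{MP}(s) \right| \ge \frac{\varepsilon}{\sqrt{E}} \right) \le C e^{-c \varepsilon \sqrt{N\eta/\sqrt{E}}} + C e^{-c (\log N)^{b/4}}. \tag{1.7} \]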



The proof of Theorem 3 can be obtained from Theorem 2 similarly as in [9, Cor. 4.2]. Finally, Theorem 3 implies complete delocalization of the normalized eigenvectors of $X_N^* X_N$ associated with eigenvalues in the window $\left[ \frac{(\log N)^{2b}}{\kappa^2 N^2},\, 4 - \kappa \right]$, for any $\kappa > 0$.



Theorem 4 (Delocalization). Let $X_N$ be an $N \times N$ matrix as described in (1.1), whose entries satisfy (1.5). Assume moreover that the probability density function of the real and imaginary part of the entries is bounded. Fix $0 < \kappa < 1$, $b > 4$. Then there exist constants $c, C > 0$ such that, for $N$ large enough,

\[ \mathbb{P}\left( \exists\, u \text{ s.t. } X_N^* X_N u = s u, \; \|u\|_2 = 1, \; s \in \left[ \frac{(\log N)^{2b}}{\kappa^2 N^2},\, 4 - \kappa \right] \text{ and } \|u\|_\infty \ge C \frac{(\log N)^{b/2}}{N^{1/2}} \right) \le e^{-c (\log N)^{b/4}}. \tag{1.8} \]



2. Basic definitions and results 
In this section we collect several definitions and results which will be used to prove the main theorems. 



2.1. A formula for $\Delta_N$. The proofs of Theorems 1 and 2 rely on the following formula for the diagonal components of the resolvent $(X_N^* X_N - \theta)^{-1}$ (see [13]):

\[ \left( (X_N^* X_N - \theta)^{-1} \right)_{kk} = \frac{1}{|w_k|^2 - \theta - w_k^* W_k (W_k^* W_k - \theta)^{-1} W_k^* w_k} = -\frac{1}{\theta \left( 1 + w_k^* (W_k W_k^* - \theta)^{-1} w_k \right)}, \tag{2.1} \]

where $w_k = x_k/\sqrt{N}$ is the $k$-th column of the matrix $X_N$ and $W_k$ denotes the $N \times (N-1)$ matrix obtained by removing the $k$-th column from the matrix $X_N$ (notice that $W_k^* W_k$ is the $(N-1) \times (N-1)$ minor of $X_N^* X_N$, obtained by removing the $k$-th row and the $k$-th column). We used here the well-known identity

\[ W_k (W_k^* W_k - \theta)^{-1} W_k^* = W_k W_k^* (W_k W_k^* - \theta)^{-1}, \]

valid for $\mathrm{Im}\, \theta \neq 0$, which can be proved using the Neumann expansion of the resolvent. Eq. (2.1) gives the following formula for the Stieltjes transform $\Delta_N(\theta)$:

\[ \Delta_N(\theta) = -\frac{1}{N} \sum_{k=1}^{N} \frac{1}{\theta \left( 1 + w_k^* (W_k W_k^* - \theta)^{-1} w_k \right)}. \]
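The resolvent formula (2.1) can be checked numerically on a small matrix. The sketch below assumes that (2.1) reads $((X_N^* X_N - \theta)^{-1})_{kk} = -1/\big(\theta\,(1 + w_k^* (W_k W_k^* - \theta)^{-1} w_k)\big)$; the Gaussian test matrix, the value of $\theta$ and all helper functions are illustrative choices of ours, written in plain Python so that no external library is required.

```python
import random

def conj_transpose(A):
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def shifted(A, theta):
    """Return A - theta * identity for a square matrix A."""
    return [[A[i][j] - (theta if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(A)
    M = [list(row) + [1.0 + 0j if i == j else 0j for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

random.seed(0)
N, theta = 4, 0.7 + 0.3j
sigma = 0.5 ** 0.5   # real and imaginary parts with variance 1/2
# Hypothetical test matrix X_N = X / sqrt(N) with complex Gaussian entries.
X = [[complex(random.gauss(0, sigma), random.gauss(0, sigma)) / N ** 0.5
      for _ in range(N)] for _ in range(N)]

G = inverse(shifted(matmul(conj_transpose(X), X), theta))   # (X* X - theta)^{-1}

max_err = 0.0
for k in range(N):
    w = [[X[i][k]] for i in range(N)]                               # k-th column, N x 1
    W = [[X[i][j] for j in range(N) if j != k] for i in range(N)]   # N x (N-1)
    R = inverse(shifted(matmul(W, conj_transpose(W)), theta))       # (W W* - theta)^{-1}
    quad = matmul(matmul(conj_transpose(w), R), w)[0][0]            # w* R w
    rhs = -1.0 / (theta * (1.0 + quad))
    max_err = max(max_err, abs(G[k][k] - rhs))
```

The quantity `max_err` compares the two sides of (2.1) for every $k$; averaging `G[k][k]` over $k$ gives $\Delta_N(\theta)$.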

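2.2. Properties of $\Delta$. We collect here some properties of the Stieltjes transform $\Delta(\theta)$ of the Marchenko-Pastur distribution $\rho_{MP}$, defined by

\[ \Delta(\theta) = \int \frac{dx\, \rho_{MP}(x)}{x - \theta} = -\frac{1}{2} + \frac{1}{2} \sqrt{1 - \frac{4}{\theta}}, \tag{2.2} \]

where we use the branch of the square root with $\mathrm{Re} \sqrt{1 - 4/\theta} > 0$. We use the fixed point equation

\[ \theta (\Delta(\theta) + 1) = -\frac{1}{\Delta(\theta)}. \tag{2.3} \]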
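As a sanity check, the closed formula and the fixed point equation can be verified numerically. The sketch below assumes $\Delta(\theta) = -\frac12 + \frac12\sqrt{1 - 4/\theta}$ with the principal square root (which has positive real part when $\mathrm{Im}\,\theta > 0$) and compares it with a direct quadrature of $\int \rho_{MP}(x)/(x - \theta)\, dx$; the function names and the test point are ours.

```python
import cmath
import math

def delta_mp(theta: complex) -> complex:
    """Stieltjes transform of the d = 1 Marchenko-Pastur law (principal square root)."""
    return -0.5 + 0.5 * cmath.sqrt(1.0 - 4.0 / theta)

def delta_numeric(theta: complex, steps: int = 50000) -> complex:
    """Midpoint-rule value of int rho_MP(x)/(x - theta) dx with x = t**2, 0 < t < 2,
    so that rho_MP(x) dx = sqrt(4 - t**2) / pi dt."""
    h = 2.0 / steps
    total = 0j
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.sqrt(4.0 - t * t) / math.pi / (t * t - theta) * h
    return total

theta = 1.0 + 0.5j
d = delta_mp(theta)
fixed_point_residual = abs(theta * (d + 1.0) + 1.0 / d)   # theta (D + 1) = -1/D
quadrature_gap = abs(d - delta_numeric(theta))
```

Both `fixed_point_residual` and `quadrature_gap` are expected to be small, the first up to rounding and the second up to the quadrature error.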

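Lemma 2.1. Let $\theta = E + i\eta$, with $\eta > 0$. Then

\[ |\Delta(\theta)|^2 \le \frac{1}{E} \quad \text{and} \quad |1 + \Delta(\theta)|^2 \ge \max\left( \frac{E}{E^2 + \eta^2},\, \frac{1}{4} \right). \tag{2.4} \]

Moreover

\[ \mathrm{Im}\, \Delta(\theta) \ge c\, \frac{|E - 4|^{1/2} + \eta^{1/2}}{(E^2 + \eta^2)^{1/4}} \tag{2.5} \]

if $E^2 + \eta^2 < 4E$ (this condition defines a circle of radius 2 around $(E, \eta) = (2, 0)$).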
Proof. From (2.3), taking the imaginary part, we get

\[ \mathrm{Im}\, \Delta\, (1 - |\Delta|^2 E) = \eta\, \mathrm{Re}(1 + \Delta)\, |\Delta|^2. \tag{2.6} \]

Eq. (2.2) implies that $\mathrm{Re}(1 + \Delta) > 0$. Since $\mathrm{Im}\, \Delta(E + i\eta) > 0$ for $\eta > 0$, together with (2.6), this implies the first bound in (2.4). To get the second bound in (2.4) we first notice that, by (2.2), $\mathrm{Re}(1 + \Delta) \ge 1/2$. This implies immediately that $|1 + \Delta|^2 \ge 1/4$. The bound $|1 + \Delta|^2 \ge E/(E^2 + \eta^2)$ follows instead from (2.3), combined with $|\Delta|^2 \le 1/E$.
To show (2.5), we observe that, from (2.2),

\[ \mathrm{Im}\, \Delta = \frac{1}{2}\, \mathrm{Im} \sqrt{1 - \frac{4}{\theta}} \ge \frac{1}{2\sqrt{2}} \left| 1 - \frac{4}{\theta} \right|^{1/2} \ge c\, \frac{|E - 4|^{1/2} + \eta^{1/2}}{(E^2 + \eta^2)^{1/4}} \]

under the assumption that $E^2 + \eta^2 < 4E$. Here we used the fact that $\mathrm{Im} \sqrt{z} \ge |z|^{1/2}/\sqrt{2}$, if $\mathrm{Re}\, z < 0$ and $\mathrm{Im}\, z > 0$.

□
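The bounds of Lemma 2.1 and the identity (2.6) lend themselves to a quick numerical spot check. The following sketch assumes the statements $|\Delta|^2 \le 1/E$, $|1 + \Delta|^2 \ge \max(E/(E^2+\eta^2), 1/4)$ and $\mathrm{Im}\,\Delta\,(1 - |\Delta|^2 E) = \eta\,\mathrm{Re}(1+\Delta)\,|\Delta|^2$; the grid of test points is an arbitrary choice of ours.

```python
import cmath

def delta_mp(theta: complex) -> complex:
    """Closed formula (2.2) with the principal branch of the square root."""
    return -0.5 + 0.5 * cmath.sqrt(1.0 - 4.0 / theta)

def check_bounds(E: float, eta: float):
    d = delta_mp(complex(E, eta))
    first = abs(d) ** 2 <= 1.0 / E
    second = abs(1.0 + d) ** 2 >= max(E / (E * E + eta * eta), 0.25)
    # Residual of the imaginary-part identity (2.6).
    residual = abs(d.imag * (1.0 - abs(d) ** 2 * E)
                   - eta * (1.0 + d).real * abs(d) ** 2)
    return first, second, residual

grid = [(E, eta) for E in (1e-3, 0.1, 1.0, 3.0, 3.9) for eta in (1e-4, 1e-2, 0.5)]
results = [check_bounds(E, eta) for E, eta in grid]
```

Note that both inequalities in (2.4) become equalities as $\eta \to 0$ for $0 < E < 4$, so the margins in `results` shrink with $\eta$.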
2.3. Large deviations of quadratic forms. We will make use of the following inequality for the fluctuations of quadratic forms, due to Hanson and Wright. For the proof of the next proposition we refer to [11, Prop. 4.5], see also [HI App. B] and [IE] .

Proposition 2.2. For $j = 1, \dots, N$ let $x_j = \mathrm{Re}\, x_j + i\, \mathrm{Im}\, x_j$, where $\{\mathrm{Re}\, x_j, \mathrm{Im}\, x_j\}_{j=1}^{N}$ is a sequence of $2N$ real iid random variables, whose common distribution satisfies (1.5). Let $A = (a_{ij})$ be an $N \times N$ complex matrix. Then there exist constants $c, C > 0$ such that, for any $\delta > 0$,

\[ \mathbb{P}\left( \left| \sum_{i,j=1}^{N} a_{ij} \left( x_i \bar{x}_j - \mathbb{E}\, x_i \bar{x}_j \right) \right| \ge \delta \right) \le C e^{-c \min\left\{ \delta/\sqrt{\mathrm{Tr}\, A^* A},\; \delta^2/\mathrm{Tr}\, A^* A \right\}}. \]

The following proposition is a consequence of the Hanson-Wright inequality. Its proof can be found, for example, in [11].

Proposition 2.3. Let $\{v_\alpha\}_{\alpha=1}^{m}$ be a set of $m$ orthonormal vectors in $\mathbb{C}^N$ and $x = (x_1, \dots, x_N) \in \mathbb{C}^N$, where $\{\mathrm{Re}\, x_j, \mathrm{Im}\, x_j\}_{j=1}^{N}$ are $2N$ real iid random variables satisfying (1.5) such that $\mathbb{E}\, x_j = 0$ and $\mathbb{E}\, |x_j|^2 = 1$. Then there exist two constants $c, C > 0$ such that

\[ \mathbb{P}\left( \sum_{\alpha=1}^{m} |x \cdot v_\alpha|^2 \le \frac{m}{2} \right) \le C e^{-c \sqrt{m}}. \]
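The content of Proposition 2.3 can be illustrated by a small Monte Carlo experiment. The sketch below makes simplifying assumptions that are ours, not the paper's: the entries are complex Gaussian (one admissible subgaussian law) and the orthonormal vectors $v_\alpha$ are the first $m$ standard basis vectors, so that $\sum_\alpha |x \cdot v_\alpha|^2 = \sum_{j \le m} |x_j|^2$.

```python
import random

def small_ball_frequency(m: int, trials: int, seed: int = 0) -> float:
    """Empirical frequency of the event sum_{a <= m} |x . v_a|^2 <= m/2, with v_a the
    first m standard basis vectors and x_j complex Gaussian with E|x_j|^2 = 1."""
    rng = random.Random(seed)
    sigma = 0.5 ** 0.5          # real and imaginary parts each have variance 1/2
    hits = 0
    for _ in range(trials):
        total = 0.0
        for _ in range(m):
            total += rng.gauss(0.0, sigma) ** 2 + rng.gauss(0.0, sigma) ** 2
        if total <= m / 2:
            hits += 1
    return hits / trials

freq_small = small_ball_frequency(m=5, trials=4000)
freq_large = small_ball_frequency(m=40, trials=4000)
```

The frequency drops quickly as $m$ grows, in line with the bound of Proposition 2.3; for Gaussian entries the decay is in fact exponential in $m$, and the weaker rate $e^{-c\sqrt{m}}$ covers the general subgaussian case.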

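3. Upper bound for the number of eigenvalues: Proof of Theorem 1

Recall that $I = [E, E + \eta]$, and that $\mathcal{N}_I$ denotes the number of eigenvalues of the matrix $X_N^* X_N$ in $I$. We have

\[
\begin{aligned}
\mathcal{N}_I &= \sum_{\alpha=1}^{N} \mathbf{1}(s_\alpha \in [E, E + \eta]) \le C \sum_{\alpha=1}^{N} \frac{\eta^2}{(s_\alpha - E)^2 + \eta^2} = C \eta\, \mathrm{Im} \sum_{\alpha=1}^{N} \frac{1}{s_\alpha - E - i\eta} \\
&= C \eta\, \mathrm{Im}\, \mathrm{Tr}\, (X_N^* X_N - E - i\eta)^{-1} = C \eta\, \mathrm{Im} \sum_{k=1}^{N} \left( (X_N^* X_N - E - i\eta)^{-1} \right)_{kk} \\
&= -C \eta\, \mathrm{Im} \sum_{k=1}^{N} \frac{1}{\theta \left( 1 + w_k^* (W_k W_k^* - \theta)^{-1} w_k \right)},
\end{aligned}
\]

where we put $\theta = E + i\eta$ and we used (2.1). Using the spectral decomposition of $W_k W_k^*$ (with eigenvalues $s_\alpha^{(k)}$ and eigenvectors $v_\alpha^{(k)}$, $\alpha = 0, \dots, N-1$, where $s_0^{(k)} = 0$), we find

\[
\begin{aligned}
\mathcal{N}_I &\le -C \eta\, \mathrm{Im} \sum_{k=1}^{N} \frac{1}{\theta - \sum_{\alpha=0}^{N-1} |w_k \cdot v_\alpha^{(k)}|^2 + \sum_{\alpha=1}^{N-1} \frac{s_\alpha^{(k)} |w_k \cdot v_\alpha^{(k)}|^2}{s_\alpha^{(k)} - \theta}} \\
&\le C \sum_{k=1}^{N} \frac{1}{1 + \sum_{\alpha=1}^{N-1} \frac{s_\alpha^{(k)} |w_k \cdot v_\alpha^{(k)}|^2}{(s_\alpha^{(k)} - E)^2 + \eta^2}} \le C\, \frac{N \eta^2}{E} \sum_{k=1}^{N} \frac{1}{\sum_{\alpha :\, s_\alpha^{(k)} \in [E, E + \eta]} N |w_k \cdot v_\alpha^{(k)}|^2},
\end{aligned}
\]

where, in the second inequality, we used the fact that $|\mathrm{Im}\, (1/z)| \le 1/|\mathrm{Im}\, z|$, and, in the last one, that $s_\alpha^{(k)} \ge E$ and $(s_\alpha^{(k)} - E)^2 + \eta^2 \le 2\eta^2$ for $s_\alpha^{(k)} \in [E, E + \eta]$ (the constant $C$ may change from line to line). Setting $K = 2C^{1/2}$, it follows that

\[ \mathcal{N}_I \le K \frac{N\eta}{\sqrt{E}} \]

unless there exists $k \in \{1, \dots, N\}$, with

\[ \sum_{\alpha :\, s_\alpha^{(k)} \in [E, E + \eta]} N |w_k \cdot v_\alpha^{(k)}|^2 \le \frac{K}{4} \frac{N\eta}{\sqrt{E}}. \]

In other words,

\[ \mathbb{P}\left( \mathcal{N}_I \ge K \frac{N\eta}{\sqrt{E}} \right) \le \mathbb{P}\left( \mathcal{N}_I \ge K \frac{N\eta}{\sqrt{E}} \text{ and } \exists\, k \in \{1, \dots, N\} \text{ with } \sum_{\alpha :\, s_\alpha^{(k)} \in [E, E + \eta]} N |w_k \cdot v_\alpha^{(k)}|^2 \le \frac{K}{4} \frac{N\eta}{\sqrt{E}} \right). \]

Since $W_k^* W_k$ is a minor of $X_N^* X_N$, it follows that its eigenvalues are interlaced between the eigenvalues of $X_N^* X_N$. This implies that

\[ \left| \{ \alpha : s_\alpha^{(k)} \in [E, E + \eta] \} \right| \ge \mathcal{N}_I - 1 \ge \frac{K}{2} \frac{N\eta}{\sqrt{E}} \]

on the event we consider. Proposition 2.3 (applied with $E = 1$, $\eta = 0$) implies therefore that

\[ \mathbb{P}\left( \mathcal{N}_I \ge K \frac{N\eta}{\sqrt{E}} \right) \le C N e^{-c \sqrt{K N \eta/\sqrt{E}}} \le C e^{-c \sqrt{K N \eta/\sqrt{E}}} \]

after adjusting the constants. This concludes the proof of Theorem 1.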
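The first step of the proof above, bounding the counting function by the imaginary part of a resolvent trace, is a deterministic inequality that holds for any spectrum. A minimal sketch, assuming the explicit constant $C = 2$ (which works because $\eta^2/((s - E)^2 + \eta^2) \ge 1/2$ for $s \in [E, E+\eta]$) and an arbitrary test spectrum of our choosing:

```python
import random

def count_in_window(spectrum, E, eta):
    """Number of points of the spectrum in [E, E + eta]."""
    return sum(1 for s in spectrum if E <= s <= E + eta)

def lorentzian_bound(spectrum, E, eta):
    """2 * eta * Im sum_a 1/(s_a - E - i*eta) = 2 * sum_a eta^2 / ((s_a - E)^2 + eta^2)."""
    theta = complex(E, eta)
    return 2.0 * eta * sum(1.0 / (s - theta) for s in spectrum).imag

random.seed(1)
# Hypothetical spectrum: any list of real numbers works for this inequality.
spectrum = [random.uniform(0.0, 4.0) for _ in range(500)]
pairs = [(count_in_window(spectrum, E, eta), lorentzian_bound(spectrum, E, eta))
         for E in (0.01, 0.5, 2.0) for eta in (0.005, 0.05, 0.3)]
```

Each pair in `pairs` holds the exact count and its resolvent upper bound; the probabilistic input of the proof only enters afterwards, through Proposition 2.3.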

4. An estimate for the number of eigenvalues close to zero 

In this section, we show that, with high probability, there cannot be too many eigenvalues at distances smaller than $1/N^2$ from 0. To this end, we need the boundedness of the probability density function of the entries. We use here the notation $\mathcal{N}[a, b]$ to indicate the number of eigenvalues in $[a, b]$.



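Proposition 4.1. Let $X_N = (x_{ij}/\sqrt{N})$ be an $N \times N$ matrix as described in equation (1.1). Assume that the probability density function $h(x)$ of $\mathrm{Re}\, x_{ij}$ and $\mathrm{Im}\, x_{ij}$ is bounded. Then there exist constants $C, c > 0$ such that

\[ \mathbb{P}\left( \mathcal{N}\left[ 0, \frac{K}{N^2} \right] \ge L \right) \le C e^{-L} \tag{4.1} \]

for all $L \ge cK$.

Proof. We start with the observation that

\[ \mathbb{P}\left( \mathcal{N}[0, K/N^2] \ge L \right) = \mathbb{P}\left( (\mathcal{N}[0, K/N^2]/L)^p \ge 1 \right) \le \mathbb{E}\, (\mathcal{N}[0, K/N^2]/L)^p. \]

Next, we notice that

\[ \mathcal{N}[0, K/N^2] \le \frac{CK}{N} \cdot \frac{1}{N} \sum_{k=1}^{N} \mathrm{Im} \left( (X_N^* X_N - \theta)^{-1} \right)_{kk}, \]

where we set $\theta = K/N^2 + iK/N^2$. This implies that

\[ \left( \frac{\mathcal{N}[0, K/N^2]}{L} \right)^p \le \left( \frac{CK}{NL} \right)^p \left( \frac{1}{N} \sum_{k=1}^{N} \mathrm{Im} \left( (X_N^* X_N - \theta)^{-1} \right)_{kk} \right)^p \le \left( \frac{CK}{NL} \right)^p \frac{1}{N} \sum_{k=1}^{N} \left( \mathrm{Im} \left( (X_N^* X_N - \theta)^{-1} \right)_{kk} \right)^p \]

by Hölder's inequality. Hence

\[ \mathbb{P}\left( \mathcal{N}[0, K/N^2] \ge L \right) \le \left( \frac{CK}{NL} \right)^p \mathbb{E} \left( \mathrm{Im} \left( (X_N^* X_N - \theta)^{-1} \right)_{11} \right)^p \le \left( \frac{CK}{L} \right)^p \mathbb{E}\, \frac{1}{\left( \sum_{\alpha=0}^{N-1} c_\alpha |x_1 \cdot v_\alpha^{(1)}|^2 \right)^p}, \]

where we neglected the real part of the denominator and we defined $x_1 = \sqrt{N} w_1$ (it is a vector in $\mathbb{C}^N$, whose components have iid real and imaginary parts with zero mean and variance $1/2$), and

\[ c_\alpha = \frac{1}{\left( N^2 s_\alpha^{(1)}/K - 1 \right)^2 + 1}. \tag{4.2} \]

We recall that the eigenvalues $s_\alpha$ and $s_\alpha^{(1)}$ are ordered in increasing order. On the event $\mathcal{N}[0, K/N^2] \ge L$, the interlacing property implies that at least $L - 1$ eigenvalues of the minor are in the interval $[0, K/N^2]$; i.e. $s_\alpha^{(1)} \in [0, K/N^2]$ for $\alpha = 0, \dots, L - 1$. This implies that $c_\alpha \ge 1/2$ for all $\alpha = 0, \dots, L - 1$. Therefore

\[ \mathbb{P}\left( \mathcal{N}[0, K/N^2] \ge L \right) \le \left( \frac{CK}{L} \right)^p \mathbb{E}\, \frac{1}{\left( \sum_{\alpha=0}^{L-1} |x_1 \cdot v_\alpha^{(1)}|^2 \right)^p}. \]

Next, we take $p = (L - 2)/2$. Since the matrix entries are assumed to have a bounded probability density function, Lemma A.1 of [18] implies that

\[ \mathbb{E}\, \frac{1}{\left( \sum_{\alpha=0}^{L-1} |x_1 \cdot v_\alpha^{(1)}|^2 \right)^p} \le C^L \]

for $C > 0$ independent of $L$. This concludes the proof of the proposition. □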

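5. Convergence of the Stieltjes transform: Proof of Theorem 2
We start from the formula (2.1), rewritten as

\[ \Delta_N = -\frac{1}{N} \sum_{k=1}^{N} \frac{1}{\theta \left( 1 + \Delta_N + \Omega_k/\sqrt{E} \right)}, \tag{5.1} \]

which immediately implies that

\[ \Delta_N + \frac{1}{\theta (\Delta_N + 1)} = \frac{1}{N} \sum_{k=1}^{N} \frac{\Omega_k/\sqrt{E}}{\theta \left( 1 + \Delta_N + \Omega_k/\sqrt{E} \right) (1 + \Delta_N)}. \tag{5.2} \]

Here we defined the error terms

\[
\begin{aligned}
\Omega_k ={}& \sqrt{E} \left( w_k^* (W_k W_k^* - \theta)^{-1} w_k - \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1} \right) \\
&+ \sqrt{E} \left( \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1} - \frac{1}{N} \mathrm{Tr}\, (X_N^* X_N - \theta)^{-1} \right).
\end{aligned}
\]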
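For the reader's convenience, the step from (5.1) to (5.2) can be written out explicitly. The short derivation below is a sketch that assumes (5.1) has the form $\Delta_N = -\frac{1}{N}\sum_k \big[\theta\,(1 + \Delta_N + \Omega_k/\sqrt{E})\big]^{-1}$; it uses only the elementary identity $\frac{1}{a} - \frac{1}{a + \omega} = \frac{\omega}{a (a + \omega)}$.

```latex
% Add 1/(\theta(\Delta_N + 1)) to both sides of (5.1) and combine term by term.
\begin{align*}
\Delta_N + \frac{1}{\theta(\Delta_N + 1)}
  &= \frac{1}{N}\sum_{k=1}^{N}
     \left[ \frac{1}{\theta\,(1 + \Delta_N)}
          - \frac{1}{\theta\,(1 + \Delta_N + \Omega_k/\sqrt{E})} \right] \\
  &= \frac{1}{N}\sum_{k=1}^{N}
     \frac{\Omega_k/\sqrt{E}}
          {\theta\,(1 + \Delta_N + \Omega_k/\sqrt{E})\,(1 + \Delta_N)} .
\end{align*}
```

In particular, if all the $\Omega_k$ vanished, $\Delta_N$ would satisfy exactly the fixed point equation (2.3) of the Marchenko-Pastur Stieltjes transform.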
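Observe that, with probability one,

\[ \sqrt{E} \left| \frac{1}{N} \mathrm{Tr}\, (X_N^* X_N - \theta)^{-1} - \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1} \right| \le \frac{C \sqrt{E}}{N\eta}. \tag{5.3} \]

This follows from

\[ \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1} = -\frac{1}{N\theta} + \frac{1}{N} \mathrm{Tr}\, (W_k^* W_k - \theta)^{-1} \]

and from the interlacing of the eigenvalues of $W_k^* W_k$ between the eigenvalues of $X_N^* X_N$. To estimate the first difference in the error $\Omega_k$, we notice that

\[ \mathbb{E}_{w_k}\, w_k^* (W_k W_k^* - \theta)^{-1} w_k = \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1}. \]

Therefore, defining the matrix $A = (a_{ij})$, with

\[ a_{ij} = \frac{\sqrt{E}}{N} \sum_{\alpha} \frac{\bar{v}_\alpha^{(k)}(i)\, v_\alpha^{(k)}(j)}{s_\alpha^{(k)} - \theta}, \]

where $s_\alpha^{(k)}$ and $v_\alpha^{(k)}$ denote the eigenvalues and eigenvectors of $W_k W_k^*$, and the vector $x = \sqrt{N} w_k = (x_1, \dots, x_N)$ (this is a vector in $\mathbb{C}^N$, whose components are order one random variables), we find

\[ \sqrt{E} \left( w_k^* (W_k W_k^* - \theta)^{-1} w_k - \frac{1}{N} \mathrm{Tr}\, (W_k W_k^* - \theta)^{-1} \right) = \sum_{i,j} a_{ij} \left( x_i \bar{x}_j - \mathbb{E}\, x_i \bar{x}_j \right) \]

and therefore (taking into account also (5.3))

\[ \mathbb{P}(|\Omega_k| \ge \varepsilon) \le \mathbb{P}\left( \left| \sum_{i,j} a_{ij} \left( x_i \bar{x}_j - \mathbb{E}\, x_i \bar{x}_j \right) \right| \ge \frac{\varepsilon}{2} \right). \]

Observe that

\[ \mathrm{Tr}\, A^* A = \frac{E}{N^2} \sum_{\alpha} \frac{1}{|s_\alpha^{(k)} - \theta|^2}. \]

In Lemma 5.1 we show that, up to an event with probability at most $e^{-c (\log N)^{b/4}}$,

\[ \mathrm{Tr}\, A^* A \le C \frac{\sqrt{E}}{N\eta}. \]

Therefore, Proposition 2.2 implies that

\[ \mathbb{P}(|\Omega_k| \ge \varepsilon) \le C e^{-c (\log N)^{b/4}} + C e^{-c \varepsilon \sqrt{N\eta/\sqrt{E}}} \]

and thus

\[ \mathbb{P}\left( \max_{k = 1, \dots, N} |\Omega_k| \ge \varepsilon \right) \le e^{-c \varepsilon \sqrt{N\eta/\sqrt{E}}} + e^{-c (\log N)^{b/4}} \tag{5.4} \]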

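after adjusting the constants. We restrict now our attention to the event $|\Omega_k| \le \varepsilon$ for all $k = 1, \dots, N$. To complete the proof of Theorem 2 we use a continuity argument. We fix $\kappa > 0$, and consider $\theta = E + i\eta \in \mathbb{C}$ with $0 < E \le 4 - \kappa$, $0 < \eta \le \kappa E$, $N\eta/\sqrt{E} \ge (\log N)^b$. We connect $\theta$ with the point $\theta_0 = 2 + 2i\kappa$. Let $L$ denote the line segment connecting $\theta$ and $\theta_0$. Note that, on $L$, $(E - 2)^2 + \eta^2 < 4$ always holds. Hence, Lemma 2.1 implies that, on $L$, $\mathrm{Im}\, \Delta \ge C_0/\sqrt{E}$ for a constant $C_0$ depending only on $\kappa$ ($C_0$ can be chosen as $\mathrm{const} \cdot \kappa^{1/2}/(1 + \kappa^2)^{1/4}$).

We claim now that, if $|\Delta_N - \Delta| \le C_0/(2\sqrt{E})$ somewhere on $L$, then $|\Delta_N - \Delta| \le C\varepsilon/\sqrt{E}$ where $C$ depends only on $\kappa$. In fact, $\mathrm{Im}\, \Delta \ge C_0/\sqrt{E}$ and $|\Delta_N - \Delta| \le C_0/(2\sqrt{E})$ imply that $\mathrm{Im}\, \Delta_N \ge C_0/(2\sqrt{E})$. This implies (for $\varepsilon < C_0/4$) that $\mathrm{Im}\, (\Delta_N + \Omega_k/\sqrt{E}) \ge C_0/(4\sqrt{E})$. Hence (5.2) gives

\[ \left| \Delta_N + \frac{1}{\theta (\Delta_N + 1)} \right| \le \frac{8\varepsilon}{C_0^2 \sqrt{E}}. \]

Subtracting the fixed point equation $\Delta + 1/(\theta(\Delta + 1)) = 0$, we find

\[ \left| (\Delta_N - \Delta) \left( 1 + \frac{\Delta}{\Delta_N + 1} \right) \right| \le \frac{8\varepsilon}{C_0^2 \sqrt{E}}. \]

Therefore

\[
\begin{aligned}
|\Delta_N - \Delta| &\le \frac{8\varepsilon}{C_0^2 \sqrt{E}}\, \frac{|\Delta_N + 1|}{|\Delta_N + \Delta + 1|} \\
&\le \frac{8\varepsilon}{C_0^2 \sqrt{E}} \left[ \mathbf{1}(|\Delta_N + 1| \ge 2|\Delta|)\, \frac{|\Delta_N + 1|}{|\Delta_N + \Delta + 1|} + \mathbf{1}(|\Delta_N + 1| \le 2|\Delta|)\, \frac{|\Delta_N + 1|}{|\Delta_N + \Delta + 1|} \right] \\
&\le \frac{16\varepsilon}{C_0^2 \sqrt{E}} \left( 1 + \frac{1}{C_0} \right),
\end{aligned}
\]

where we used that, on $L$, $\mathrm{Im}\, \Delta \ge C_0/\sqrt{E}$ and $|\Delta| \le 1/\sqrt{E}$ (see Lemma 2.1). Theorem 2 follows because, from [19], $|\Delta_N(2 + 2i\kappa) - \Delta(2 + 2i\kappa)| \le C_0/(2\sqrt{2})$ for $N$ large enough. This completes the proof of Theorem 2.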



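Lemma 5.1. Let $X_N = (x_{ij}/\sqrt{N})$ be an $N \times M$ matrix as defined in (1.1). Denote by $s_\alpha$ the eigenvalues of $X_N^* X_N$. Assume the real and imaginary part of the entries $x_{ij}$ are iid random variables with a common bounded probability density function. Let $\theta = E + i\eta$, with $\eta \le E$ and $N\eta/\sqrt{E} \ge (\log N)^b$. Then there exist constants $C_1, C_2, c > 0$ with

\[ \mathbb{P}\left( \frac{E}{N^2} \sum_{\alpha} \frac{1}{|s_\alpha - \theta|^2} \ge C_1 \frac{\sqrt{E}}{N\eta} \right) \le C_2 e^{-c (\log N)^{b/4}}. \]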

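Proof. We clearly have

\[
\begin{aligned}
\mathbb{P}\left( \frac{E}{N^2} \sum_{\alpha} \frac{1}{|s_\alpha - \theta|^2} \ge C_1 \frac{\sqrt{E}}{N\eta} \right) \le{}& \mathbb{P}\left( \frac{E}{N^2} \sum_{\alpha :\, s_\alpha \le (\log N)^b/N^2} \frac{1}{|s_\alpha - \theta|^2} \ge \frac{C_1}{3} \frac{\sqrt{E}}{N\eta} \right) \\
&+ \mathbb{P}\left( \frac{E}{N^2} \sum_{\alpha :\, s_\alpha \in [(\log N)^b/N^2,\, E/2]} \frac{1}{|s_\alpha - \theta|^2} \ge \frac{C_1}{3} \frac{\sqrt{E}}{N\eta} \right) \\
&+ \mathbb{P}\left( \frac{E}{N^2} \sum_{\alpha :\, s_\alpha > E/2} \frac{1}{|s_\alpha - \theta|^2} \ge \frac{C_1}{3} \frac{\sqrt{E}}{N\eta} \right) =: \mathrm{I} + \mathrm{II} + \mathrm{III}.
\end{aligned}
\]

We start by controlling the term I. To this end, note that

\[ \frac{E}{N^2} \sum_{\alpha :\, s_\alpha \le (\log N)^b/N^2} \frac{1}{|s_\alpha - \theta|^2} \le \frac{C}{N^2 E}\, \mathcal{N}[0, (\log N)^b/N^2], \]

where, as usual, $\mathcal{N}[a, b]$ denotes the number of eigenvalues of $X_N^* X_N$ in the interval $[a, b]$. Hence, by Prop. 4.1 and since $\eta \le E$ and $N\sqrt{E} \ge N\eta/\sqrt{E} \ge (\log N)^b$,

\[ \mathrm{I} \le \mathbb{P}\left( \mathcal{N}[0, (\log N)^b/N^2] \ge \frac{C_1}{3C} N \sqrt{E} \right) \le C e^{-(C_1/3C) N \sqrt{E}} \le C e^{-(\log N)^b} \]

for appropriate constants $C, c > 0$ and $C_1$ large enough. Next, we consider the term II. We have

\[ \frac{E}{N^2} \sum_{\alpha :\, s_\alpha \in [(\log N)^b/N^2,\, E/2]} \frac{1}{|s_\alpha - \theta|^2} \le \frac{C E}{N^2} \sum_{k=2}^{k_0} \frac{\mathcal{N}[2^{-k} E, 2^{-k+1} E]}{E^2}, \]

where $k_0 > 0$ is chosen as the smallest integer with $2^{-k_0} E \le (\log N)^b/N^2$. From Theorem 1 it follows that, for a sufficiently large $K > 0$,

\[ \mathcal{N}[2^{-k} E, 2^{-k+1} E] \le C N 2^{-k/2} \sqrt{E} \]

up to an event of probability at most $\exp(-c (2^{-k/2} N \sqrt{E})^{1/2})$. This implies that, apart from an event of total probability bounded by

\[ \sum_{k=2}^{k_0} e^{-c \sqrt{2^{-k/2} N \sqrt{E}}} \le k_0\, e^{-c \sqrt{2^{-k_0/2} N \sqrt{E}}} \le k_0\, e^{-c (\log N)^{b/4}} \le e^{-(c/2) (\log N)^{b/4}}, \]

we have

\[ \frac{E}{N^2} \sum_{\alpha :\, s_\alpha \in [(\log N)^b/N^2,\, E/2]} \frac{1}{|s_\alpha - \theta|^2} \le \frac{C E}{N^2} \sum_{k=2}^{k_0} \frac{2^{-k/2} N \sqrt{E}}{E^2} \le \frac{C}{N \sqrt{E}} \le C \frac{\sqrt{E}}{N\eta}, \]

where we used the assumption $\eta \le E$. Finally, we have to control the term III. To this end, we observe that

\[ \frac{E}{N^2} \sum_{\alpha :\, s_\alpha > E/2} \frac{1}{|s_\alpha - \theta|^2} = \frac{E}{N^2} \sum_{\alpha :\, s_\alpha > E/2} \frac{1}{(s_\alpha - E)^2 + \eta^2} \le \frac{C E}{N^2} \sum_{k \ge 0} \frac{\mathcal{N}_{I_k}}{2^{2k} \eta^2}, \]

where we set $I_k = J_k \cap [E/2, \infty)$, with $J_k = [E - 2^k \eta, E - 2^{k-1} \eta] \cup [E + 2^{k-1} \eta, E + 2^k \eta]$ for $k \ge 1$ and $J_0 = [E - \eta, E + \eta]$. Observe that, by Theorem 1,

\[ \mathcal{N}_{I_k} \le C\, 2^k \frac{N\eta}{\sqrt{E}} \]

up to an event with probability at most

\[ e^{-c \sqrt{2^k N\eta/\sqrt{E}}} \le e^{-c (\log N)^{b/2}}. \]

Therefore,

\[ \frac{E}{N^2} \sum_{\alpha :\, s_\alpha > E/2} \frac{1}{|s_\alpha - \theta|^2} \le \frac{C E}{N^2} \sum_{k \ge 0} \frac{1}{2^{2k} \eta^2}\, \frac{2^k N \eta}{\sqrt{E}} \le C \frac{\sqrt{E}}{N\eta} \]

apart from an event with probability at most $e^{-c (\log N)^{b/4}}$. This completes the proof of the lemma. □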
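The dyadic decomposition used for the term II in the proof above can be made concrete. The sketch below assumes the intervals $[2^{-k}E, 2^{-k+1}E]$, $k = 2, \dots, k_0$, with $k_0$ the smallest positive integer such that $2^{-k_0}E \le (\log N)^b/N^2$; the function name and the sample parameters are ours.

```python
import math

def dyadic_scales(E: float, N: int, b: float):
    """Return k0 and the intervals [2**-k * E, 2**(-k+1) * E] for k = 2, ..., k0, where
    k0 is the smallest positive integer with 2**-k0 * E <= (log N)**b / N**2."""
    cutoff = math.log(N) ** b / N ** 2
    k0 = 1
    while 2.0 ** (-k0) * E > cutoff:
        k0 += 1
    intervals = [(2.0 ** (-k) * E, 2.0 ** (-k + 1) * E) for k in range(2, k0 + 1)]
    return k0, intervals

# Hypothetical sample values.
k0, intervals = dyadic_scales(E=1.0, N=10**6, b=4.5)
```

Since $k_0$ grows only logarithmically in $N$, the factor $k_0$ in front of the exponential in the proof is harmless and can be absorbed by halving the constant $c$.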

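6. Delocalization: Proof of Theorem 4

Denote by $\{s_\alpha\}_{\alpha=1}^{N}$ the eigenvalues of the matrix $X_N^* X_N$ with $s_1 \le \dots \le s_N$ and by $\{u_\alpha\}_{\alpha=1}^{N}$ the corresponding set of orthonormal eigenvectors. From the equation $X_N^* X_N u_\alpha = s_\alpha u_\alpha$ and the condition $\|u_\alpha\|_2 = 1$ it follows that (see also [25, Cor. 25] and DUE])

\[ |u_\alpha(k)|^2 = \frac{1}{1 + \frac{1}{N} \sum_{\beta} \frac{s_\beta^{(k)}\, |x_k \cdot v_\beta^{(k)}|^2}{(s_\alpha - s_\beta^{(k)})^2}}, \]

where $x_k = \sqrt{N} w_k$ and $w_k$ denotes the $k$-th column of the matrix $X_N$, while $s_\beta^{(k)}$ and $v_\beta^{(k)}$ are the eigenvalues and the corresponding eigenvectors of the matrix $W_k W_k^*$, where the matrix $W_k$ is obtained by removing the $k$-th column from the matrix $X_N$. For arbitrary $\eta > 0$, we have

\[ |u_\alpha(k)|^2 \le \frac{N \eta^2}{s_\alpha}\, \frac{1}{\sum_{\beta :\, s_\beta^{(k)} \in [s_\alpha, s_\alpha + \eta]} |x_k \cdot v_\beta^{(k)}|^2}. \]

Taking $\eta = \sqrt{s_\alpha}\, (\log N)^b/N$, Theorem 3 implies that

\[ \left| \{ \beta : s_\beta^{(k)} \in [s_\alpha, s_\alpha + \eta] \} \right| \ge C (\log N)^b \]

up to an event with probability smaller than $e^{-c (\log N)^{b/4}}$. Prop. 2.3 implies then that

\[ \sum_{\beta :\, s_\beta^{(k)} \in [s_\alpha, s_\alpha + \eta]} |x_k \cdot v_\beta^{(k)}|^2 \ge C (\log N)^b \]

apart from an event with probability smaller than $e^{-c (\log N)^{b/2}}$. This implies that

\[ \mathbb{P}\left( |u_\alpha(k)|^2 \ge C \frac{(\log N)^b}{N} \right) \le C e^{-c (\log N)^{b/4}}. \]

Taking the maximum over $k$, Theorem 4 follows.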
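The identity for $|u_\alpha(k)|^2$ used at the beginning of this section follows from a block decomposition of the eigenvalue equation. The following is a sketch of the computation under assumptions of ours: the columns are reordered so that $X_N = [w_k, W_k]$, the eigenvector is split as $u_\alpha = (u_\alpha(k), \tilde{u})$ with $\tilde{u} \in \mathbb{C}^{N-1}$, $s_\alpha$ is not an eigenvalue of $W_k^* W_k$, and $v_\beta^{(k)}$ are taken to be the left singular vectors of $W_k$ (eigenvectors of $W_k W_k^*$).

```latex
% Lower block of X_N^* X_N u_\alpha = s_\alpha u_\alpha, then the normalization of u_\alpha.
\begin{align*}
W_k^* w_k\, u_\alpha(k) + W_k^* W_k\, \tilde{u} = s_\alpha \tilde{u}
  \quad &\Longrightarrow \quad
  \tilde{u} = (s_\alpha - W_k^* W_k)^{-1} W_k^* w_k\, u_\alpha(k), \\
1 = |u_\alpha(k)|^2 + \|\tilde{u}\|^2
  &= |u_\alpha(k)|^2 \left( 1 + w_k^* W_k\, (s_\alpha - W_k^* W_k)^{-2}\, W_k^* w_k \right) \\
  &= |u_\alpha(k)|^2 \left( 1 + \frac{1}{N} \sum_{\beta}
     \frac{s_\beta^{(k)}\, |x_k \cdot v_\beta^{(k)}|^2}{(s_\alpha - s_\beta^{(k)})^2} \right).
\end{align*}
```

The last line uses the singular value decomposition of $W_k$ and $w_k = x_k/\sqrt{N}$; keeping only the terms with $s_\beta^{(k)} \in [s_\alpha, s_\alpha + \eta]$ gives the upper bound on $|u_\alpha(k)|^2$ stated above.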